Forensic Science International: Genetics

Elsevier BV

All preprints, ranked by how well they match Forensic Science International: Genetics's content profile, based on 24 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.

1
Calculating likelihoods and likelihood ratios at SNPs-based mixtures. A reappraisal of the binomial inference, as applied to forensic identity tests

Pascali, V. L.

2021-02-08 genetics 10.1101/2021.02.08.430218 medRxiv
Top 0.1%
97.4%

Single nucleotide polymorphisms (SNPs) are useful forensic markers. When a SNP-based forensic protocol targets a body fluid stain, it returns elementary evidence regardless of the number of individuals that might have contributed to the stain deposition. Therefore, drawing inference from a mixed stain with SNPs is different from drawing it with multinomial polymorphisms. We here revisit this subject, with a view to contributing fresh insight into it. First, we model conditional semi-continuous likelihoods in terms of matrices of genotype permutations vs number of contributors (NTZsc). Secondly, we redefine some algebraic formulas to approach the semi-continuous calculation. To address allelic dropouts, we introduce a peak height ratio index (h, i.e. the minor read divided by the major read at any NGS-based typing result) into the semi-continuous formulas, so that they act as an acceptable proxy of the split drop (Haned et al., 2012) model of calculation. Thirdly, we introduce a new, empirical method to deduce the expected quantitative ratio at which the contributors of a mixture originally mixed and the observed ratio generated by each genotype combination at each locus. Compliance between observed and expected quantity ratios is measured in terms of (1-χ²) values at each state of a locus deconvolution. These probability values are multiplied, along with the h index, by the relevant population probabilities to weigh the overall plausibility of each combination from the quantitative perspective. We compare the calculation performance of our empirical procedure (NITZq) with that of the EUROFORMIX software ver. 3.0.3. NITZq generates LR values a few orders of magnitude lower than EUROFORMIX when true contributors are used as POIs, but much lower LR values when false contributors are used as POIs. NITZ calculation routines may be useful, especially in combination with mass genomics typing protocols.

2
Assessment of DNA methylation from a single genomic region of ELOV2 is sufficient to predict chronological age

Zhu, B.; Li, D.; Han, G.; Yao, X.; Gu, H.; Liu, T.; Liu, L.; Dai, J.; Liu, I. Z.; Liang, Y.; Zheng, J.; Sun, Z.; Lin, H.; Wang, W.; Liu, N.; Yu, H.; Shi, M.; Shen, G.; Qu, L.

2024-12-15 genetics 10.1101/2024.12.10.627662 medRxiv
Top 0.1%
97.1%

Estimation of chronological age is particularly informative in forensic contexts. Assessment of DNA methylation status allows for the prediction of age, though the accuracy and ease of manipulation may vary across different models. In this study, we started with a carefully designed discovery cohort that recruited more elderly subjects than subjects in other age categories, to diminish the effect of epigenetic drift. We analyzed DNA methylation from a single genomic region of ELOV2, which was sufficient to construct an age-prediction model comprising 15 CpG sites. This model was further validated in an independent cohort as well as in a multi-center test using trace dried bloodstains. Because our analytical pipeline combines the assessment of a single genomic locus with high-throughput sequencing, it can easily be scaled up at low cost. Taken together, we propose a new age-prediction model featuring accuracy, ease of manipulation, high throughput, and low cost. This model can be readily applied in both classic and newly emergent forensic contexts that require age estimation.

3
Likelihood Ratios for physical traits in forensic investigations

Marsico, F.; Egeland, T.

2024-05-30 genetics 10.1101/2024.05.25.595720 medRxiv
Top 0.1%
96.9%

Recent years have seen significant advances in DNA phenotyping, which predicts the physical traits of an unknown person, such as hair, eyes, and skin color, using DNA data. This technique is increasingly used in forensic investigations to identify missing persons, disaster victims, and suspects of crimes. A key contribution of DNA phenotyping is that it allows researchers to search through lists of individuals with similar characteristics, often gathered from testimonies, photographs, and social media data. However, despite their growing relevance, current methods lack comprehensive mathematical models to calculate likelihood ratios that accurately assess the statistical weight of evidence. Our work bridges this gap by developing new likelihood ratio models, validated through computational simulations. In addition, we demonstrate the ability of these models to improve forensic investigations in real-world scenarios. Furthermore, we introduce the R package forensicolors, freely available on CRAN, to facilitate the application of the methodologies developed.

4
Are genetically defined "metapopulations" self-evident in YHRD?

Olsen, T. O. S.; Andersen, M. M.; Curran, J.; Krawczak, M.; Caliebe, A.

2026-02-10 genetics 10.64898/2026.02.07.704579 medRxiv
Top 0.1%
94.3%

In forensic genetics, the evidential value of a match between the Y-chromosomal short tandem repeat (Y-STR) profiles of a trace and a suspect is typically quantified by the frequency of the profile in a population database, particularly the Y-chromosomal Haplotype Reference Database (YHRD). However, for this approach of obtaining a match probability to be valid, the database population must be representative of all plausible alternative trace donors in a given case. Since appropriately defining such a suspect population can be difficult, YHRD highlights so-called metapopulations that comprise profiles from different, geographically dispersed populations with presumed shared ancestry. We investigated whether such metapopulations are self-evident in the current version of YHRD. To this end, we performed classical cluster analysis using allele dissimilarity as a measure of pairwise distance between Y-STR profiles. Our analyses revealed only a weak genetic structure in YHRD, the extent of which was inversely proportional to the respective marker mutation rate. This suggests that YHRD cannot be divided into clearly distinguishable subgroups based solely on the genetic information it contains, at least not into subgroups that would correspond closely to the metapopulations highlighted in the database itself. If profile frequencies in metapopulations are to continue to be equated with match probabilities, then a clearer definition of metapopulations and a better justification of their use in forensics are needed.
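
As a rough illustration of "allele dissimilarity as a measure of pairwise distance", the Python sketch below scores two Y-STR profiles by the fraction of shared loci at which their alleles differ. The abstract does not give the exact definition the authors used, so this metric, the function names, and the three toy profiles are all assumptions made for illustration.

```python
from itertools import combinations


def allele_dissimilarity(profile_a, profile_b):
    """Fraction of shared loci at which two Y-STR profiles carry different alleles.

    Assumed definition for illustration; the abstract does not specify the metric.
    """
    shared_loci = profile_a.keys() & profile_b.keys()
    if not shared_loci:
        raise ValueError("profiles share no loci")
    mismatches = sum(1 for locus in shared_loci if profile_a[locus] != profile_b[locus])
    return mismatches / len(shared_loci)


def pairwise_distances(profiles):
    """Distance for every unordered pair of profile names, keyed by (name_a, name_b)."""
    names = sorted(profiles)
    return {
        (a, b): allele_dissimilarity(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }


# Toy profiles (hypothetical repeat counts, not taken from YHRD).
profiles = {
    "P1": {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS393": 13},
    "P2": {"DYS19": 14, "DYS390": 23, "DYS391": 11, "DYS393": 13},
    "P3": {"DYS19": 15, "DYS390": 25, "DYS391": 10, "DYS393": 13},
}

distances = pairwise_distances(profiles)
for pair, distance in distances.items():
    print(pair, distance)
```

A distance table of this kind could then be passed to any standard hierarchical clustering routine. Which clustering method the authors applied is not stated in the abstract.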

5
Mispitools: An R Package for Comprehensive Statistical Methods in Kinship Inference

Marsico, F.

2024-08-19 genetics 10.1101/2024.08.16.608307 medRxiv
Top 0.1%
93.6%

The search for missing persons is a complex process that involves the comparison of data from two entities: unidentified persons (UP), who may be alive or deceased, and missing persons (MP), whose whereabouts are unknown. Although existing tools support DNA-based kinship analyses for the search, they typically do not integrate or statistically evaluate diverse lines of evidence collected throughout the investigative process. Examples of alternative lines of evidence are pigmentation traits, biological sex, and age, among others. The package Mispitools fills this gap by providing comprehensive statistical methods adapted to a holistic investigation workflow. Mispitools systematically assesses the data from each investigative stage, computing the statistical weight of various types of evidence through a likelihood ratio (LR) approach. It also provides models for combining obtained LRs. Furthermore, Mispitools offers customized visualizations and a user-friendly interface, broadening its applicability among forensic practitioners and genealogical researchers.
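
To make the idea of combining LRs from several lines of evidence concrete, here is a minimal Python sketch. It assumes the lines of evidence are conditionally independent, so the LRs multiply, and converts the combined LR to a posterior probability with Bayes' rule in odds form. This is a textbook simplification, not the Mispitools API (Mispitools is an R package), and the function names and numeric values are hypothetical.

```python
import math


def combine_lrs(lrs):
    """Combined LR under the assumption of conditionally independent evidence."""
    if any(lr <= 0 for lr in lrs):
        raise ValueError("likelihood ratios must be positive")
    return math.prod(lrs)


def posterior_probability(prior, lr):
    """Posterior probability of the match hypothesis from a prior and an LR."""
    if not 0 < prior < 1:
        raise ValueError("prior must lie strictly between 0 and 1")
    posterior_odds = prior / (1 - prior) * lr
    return posterior_odds / (1 + posterior_odds)


# Hypothetical LRs for DNA, biological sex, and a pigmentation trait.
combined = combine_lrs([100.0, 2.0, 0.5])
print(combined)
print(posterior_probability(0.5, combined))
```

When the lines of evidence are correlated (for example hair and skin color), a simple product overstates or understates the weight of evidence, which is one reason dedicated combination models are needed.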

6
Using all available evidence to solve kinship cases

Egeland, T.; Marsico, F.

2025-05-08 genetics 10.1101/2025.05.03.652046 medRxiv
Top 0.1%
92.4%

Kinship cases, ranging from standard paternity tests to complex disaster victim identifications, are typically evaluated using likelihood ratios (LR) based on forensic genetic markers. However, in some contexts, genetic information alone is not enough to reach conclusive results. This is common when establishing distant familial connections using large DNA-databases, or even in simple cases such as determining which individual is the parent and which is the child in a relationship pair. Although forensic practitioners frequently incorporate additional evidence (SE), such as age, biological sex, or phenotypic traits, in these cases, this integration typically occurs informally, without rigorous probability estimation, compromising procedural transparency and reliability. Here, we present a comprehensive methodological framework that formally synthesizes forensic DNA evidence (FDE) with SE through Markov chain models and customized transition matrices designed for various biological traits. This approach generates combined likelihood assessments expressed as LRs or posterior probabilities. Validation through simulated and real-world case studies demonstrates that systematic incorporation of SE improves resolution accuracy in kinship determinations. To facilitate adoption, we have implemented this methodology in mispitools, an open-source R package.

7
Secondary DNA transfer on denim using a human blood analogue

Ridings, R.; Gabriel, A.; Elliott, C. I.; Shafer, A.

2021-11-25 genetics 10.1101/2021.11.25.470033 medRxiv
Top 0.1%
90.5%

DNA quantification technology has increased in accuracy and sensitivity, now allowing for detection and profiling of trace DNA. Secondary DNA transfer occurs when DNA is deposited via an intermediary source (e.g. clothing, tools, utensils). Multiple courtrooms have now seen secondary transfer introduced as an explanation for DNA being present at a crime scene, but sparse experimental studies mean expert opinions are often limited. Here, we used bovine blood and indigo denim substrates to quantify the amount of secondary DNA transfer and quality of STRs under three different physical contact scenarios: passive, pressure, and friction. We showed that the DNA transfer was highest under a friction scenario, followed by pressure and passive treatments. The STR profiles showed a similar, albeit less pronounced trend, with correctly scored alleles and genotype completeness being highest under a friction scenario, followed by pressure and passive. DNA on the primary substrate showed a decrease in concentration and genotype completeness both immediately and at 24 hours, suggestive of a loss of DNA during the primary transfer. The majority of secondary transfer samples amplified less than 50% of STR loci regardless of contact type. This study showed that while DNA transfer is common between denim, this is not manifested in full STR profiles. We discuss the possible technical solutions to partial profiles from trace DNA, and more broadly the ubiquity of secondary DNA transfer.

8
Open-Access STRS Database Of Populations From The 1000 Genomes Project Using High Coverage Phase 3 Data

Frontanilla, T. S.; Valle Silva, G.; Ayala, J.; Mendes, C. T.

2021-09-07 genetics 10.1101/2021.09.06.459168 medRxiv
Top 0.1%
86.0%

Accurate STR genotyping from next-generation sequencing (NGS) data has been challenging. Haplotype inference and phasing for STRs (HipSTR) was specifically developed to deal with genotyping errors and obtain reliable STR genotypes from whole-genome sequencing datasets. The objective of this investigation was to perform a comprehensive genotyping analysis of a set of STRs of broad forensic interest from the 1000 Genomes populations and release a reliable open-access STR database to the forensic genetics community. A set of 22 STR markers was analyzed using the CRAM files of the 1000 Genomes Project Phase 3 high-coverage (30x) dataset generated by the New York Genome Center (NYGC). HipSTR was used to call genotypes from 2,504 samples from 26 populations organized into five groups: African, East Asian, European, South Asian, and admixed American. The D21S11 marker could not be detected in the present study. Moreover, the Hardy-Weinberg equilibrium analysis, coupled with a comprehensive analysis of allele frequencies, revealed that HipSTR could not identify longer Penta E (and, to a lesser extent, Penta D) alleles. This issue is probably due to the limited length of sequencing reads available for genotype calling, resulting in heterozygote deficiency. Notwithstanding that, AMOVA, a clustering analysis using STRUCTURE, and a Principal Coordinates Analysis revealed a clear-cut separation between the four major ancestries sampled by the 1000 Genomes Consortium (AFR, EUR, EAS, SAS). Meanwhile, the AMOVA results corroborated previous reports that most of the variance (97.12%) is observed within populations. This set of analyses revealed that, except for larger Penta D and Penta E alleles, allele frequencies and genotypes defined by HipSTR from the 1000 Genomes Project Phase 3 data and offered as an open-access database are consistent and highly reliable.

9
Comparison of three quantitative approaches for estimating time-since-deposition from autofluorescence and morphological profiles of cell populations from forensic biological samples

Gentry, A. E.; Ingram, S.; Philpott, M. K.; Archer, K. J.; Ehrhardt, C. J.

2023-04-20 molecular biology 10.1101/2023.04.19.537512 medRxiv
Top 0.1%
82.9%

Determining when DNA recovered from a crime scene was transferred from its biological source, i.e., a sample's time-since-deposition (TSD), can provide critical context for biological evidence. Yet, there remain no analytical techniques for TSD that are validated for forensic casework. In this study, we investigate whether morphological and autofluorescence measurements of forensically relevant cell populations generated with Imaging Flow Cytometry (IFC) can be used to predict the TSD of touch or trace biological samples. To this end, three different prediction frameworks for estimating the number of days for TSD were evaluated: the elastic net, gradient boosting machines (GBM), and generalized linear mixed model (GLMM) LASSO. Additionally, we transformed these continuous predictions into a series of binary classifiers to evaluate the potential utility for forensic casework. GBM and GLMM-LASSO had the highest accuracy, with mean absolute error estimates in a hold-out test set of 29 and 21 days, respectively. Binary classifiers for these models correctly binned 94-96% and 98-99% of the age estimates as over/under 7 or 180 days, respectively. This suggests that predicted TSD using IFC measurements, coupled with one or possibly a combination of binary classification decision rules, may provide probative information for trace biological samples encountered during forensic casework.
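
The step of turning continuous TSD predictions into over/under classifiers can be sketched in Python as follows. The sketch treats "over" as strictly greater than the threshold, which is an assumption, and the predicted and true values are invented toy numbers, not data from the study.

```python
def fraction_binned_correctly(predicted_days, true_days, threshold_days):
    """Share of samples whose predicted TSD falls on the same side of the
    threshold as the true TSD ("over" means strictly greater; an assumption)."""
    if len(predicted_days) != len(true_days) or not true_days:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        (predicted > threshold_days) == (true > threshold_days)
        for predicted, true in zip(predicted_days, true_days)
    )
    return hits / len(true_days)


# Invented toy values for illustration only.
true_days = [1, 3, 10, 30, 200, 400]
predicted_days = [2, 9, 6, 45, 150, 380]

for threshold in (7, 180):
    share = fraction_binned_correctly(predicted_days, true_days, threshold)
    print(f"over/under {threshold} days: {share:.2f} binned correctly")
```

A model with a large mean absolute error can still bin most samples correctly when few true values lie near the threshold, which is consistent with the high binary accuracy reported alongside errors of 21 to 29 days.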

10
SNP assays for DVI: cost, time, and performance information for decision-makers

Gettings, K. B.; Tillmar, A.; Marshall, C.; Sturk-Andreaggi, K.

2024-05-11 molecular biology 10.1101/2024.05.10.593619 medRxiv
Top 0.1%
74.3%

In mass disaster events, forensic DNA laboratories may be called upon to quickly pivot their operations toward identifying bodies and reuniting remains with family members. Ideally, laboratories have considered this possibility in advance and have a plan in place. Compared with traditional short tandem repeat (STR) typing, single nucleotide polymorphisms (SNPs) may be better suited to these disaster victim identification (DVI) scenarios due to their small genomic target size, resulting in an improved success rate in degraded DNA samples. As the landscape of technology has shifted toward DNA sequencing, many forensic laboratories now have benchtop instruments available for massively parallel sequencing (MPS), facilitating this operational pivot from routine forensic STR casework to DVI SNP typing. Herein, we review the commercially available SNP sequencing assays amenable to DVI, we use data simulations to explore the potential for kinship prediction from SNP panels of varying size, and we give an example DVI scenario as context for presenting the matrix of considerations: kinship predictive potential, cost, and throughput of current SNP assay options. This information is intended to assist laboratories in choosing a SNP system for disaster preparedness.

Highlights:
- Single nucleotide polymorphisms (SNPs) are useful in disaster victim identification (DVI).
- SNP panels amenable to human identification and extended kinship are described.
- Simulations demonstrate the potential for kinship prediction from SNP panels of varying size.
- Kinship predictive potential, cost, and throughput are presented for an example DVI scenario.
- Information is intended to assist laboratories in choosing a SNP system for disaster preparedness.

11
Rapid on-site universal vertebrate species identification via multi-barcode nanopore sequencing

Patterson, E. C.; Morrison-Lanjouw, S.; Jobling, M. A.; Wetton, J. H.

2025-06-08 genetics 10.1101/2025.06.04.657926 medRxiv
Top 0.1%
73.4%

The growing illegal wildlife trade (IWT) threatens biodiversity and is a conduit for zoonotic disease, yet its risk of detection is low. Once processed, trafficked species are difficult to identify morphologically, and currently require DNA-based approaches that are time-consuming, costly, and lab-based. There is thus a need for a rapid, cheap, on-site method for species identification. We describe VeRIF-ID (Vertebrate Rapid In-Field Identification via DNA), a method that employs simultaneous on-site nanopore sequencing of four different mitochondrial DNA barcodes. Primers were designed to produce short amplicons to aid analysis of damaged DNA, and to be effective over a broad taxonomic range of vertebrates from lamprey to chimpanzee. Validation demonstrated species-level identification in 91% of 83 tested species, and genus/tribe-level identification of the remaining species (which are also problematic with existing approaches). DNA extraction, PCR and library preparation steps were simplified and optimised so that sampling to species identification takes <3 h for a single sample. Species components are identifiable non-quantitatively in prepared mixtures of muscle tissue from up to five species, and laboratory tests of Traditional East Asian Medicine samples reveal DNA from species including critically endangered saiga antelope and black rhinoceros. In conjunction with a portable BentoLab device, the necessary equipment and reagents are easily portable, and we apply the method to analyse seized bushmeat and fish samples within an airport customs zone, identifying mammal and fish species in 15 samples within 6 h. The initial equipment costs for VeRIF-ID are ~$8K, and the cost per sample is ~$10-48 (depending on set-up), considerably cheaper than current conventional lab-based approaches. The method requires only basic hands-on skills. Ongoing trials with potential end-users will focus on establishing forensic reporting criteria prior to casework implementation. Future development of user-friendly bioinformatic interfaces will aim to fully democratise species identification.

12
STRategy: A support system for collecting and analyzing short tandem repeats for forensic science

Kulthammanit, N.; Sukawutthiya, P.; Noh, H.; Vongpaisarnsin, K.; Wichadakul, D.

2023-02-21 genetics 10.1101/2023.02.20.529208 medRxiv
Top 0.1%
72.6%

Short tandem repeats (STRs) are short repeated sequences commonly found in the human genome. They provide many advantages to forensic sciences, such as identifying individuals, estimating the likelihood of kinship, and analyzing mixtures. Next-generation sequencing (NGS) technologies, e.g., ForenSeq Signature Prep, have been proposed for sequencing STRs, obtaining the sequence of each locus and SNPs, and inferring length-based alleles. However, even though the sequenced STRs from ForenSeq offer more insights into the STRs, which lead to the genetic analysis of population and sub-population structures, no open-source software platform enables the collection and management of STR data from NGS and incorporates related analysis tools in one place. Here, we introduce STRategy, a standalone web-based application supporting essential STR data management and analysis capabilities. The analyzed data are visualized in various forms, for example, charts, maps, and pattern alignments. The system implements role-based access control that allows users to search or access specific data depending on their responsibilities. Public users can search for data and view statistical data, for example, detailed alleles and genetic variation. Lab users can add, update, and see the information of individuals and explore pattern alignments for a specific locus within the population. Administrators can customize the system, for example, configure maps according to the samples' geographic data, and manage reference STR repeat motifs. We designed and developed STRategy using software engineering principles for flexible extension and easy deployment using Docker containers. The source code is publicly available at https://github.com/cucpbioinfo/STRategy. We also deployed a showcase system on a cloud computing service; its URL is listed in the GitHub repository. The current version only supports the ForenSeq sample detail report files.

13
Quantification of Human DNA from Century-Old Archived FFPE Samples for Retrospective Genomic Studies

Zvenigorosky, V.; Gonzalez, A.; Broisin, L.; Fausser, J.-L.; Jeanjean, N.; Hluszko Pontet, A.; Cannet, C.; Lamy, C. M.; Keyser, C.; Bonah, C.

2025-10-17 genetics 10.1101/2025.10.17.681771 medRxiv
Top 0.1%
70.8%

Background: Formalin-fixed, paraffin-embedded (FFPE) tissue archives are an invaluable resource for genomic research, offering the potential to link genomic data to long-term clinical outcomes. Their utility has however been limited by the degradation caused by fixation and long-term storage, particularly for samples archived for many decades. Methods: This study evaluates a cohort of 79 FFPE tissue blocks collected and archived in 1973 from the Strasbourg Pathological Tissue Archive (SPTA), and a cohort of 51 FFPE tissue blocks from the Geneva Brain Bank (GBB), collected between 1928 and 1971. DNA was quantified using a forensic-grade quantitative PCR (qPCR) assay (QIAGEN Investigator Quantiplex Pro Kit on a Rotor-Gene Q) to precisely measure human DNA concentration, assess degradation, and detect PCR inhibition. Results: A high proportion of the samples yielded human DNA fragments of 80 bp and 95 bp and very few fragments of 205 bp. All organ samples treated with Bouin liquid (n=12) gave poor results, but among samples fixed in formalin (n=117), 58.1% showed over 0.25 ng/µL of 80 bp human DNA fragments and 37.6% over 1 ng/µL. We propose a model to describe the decay of these samples and estimate at 45.1% the proportion of samples that should yield a concentration of at least 0.25 ng/µL of fragments over 100 bp. Conclusions: This work demonstrates that automated extraction methods optimized for FFPE allow for the recovery of usable material even in century-old archived samples with inconsistent conditions of conservation. Our data suggest that the time spent in storage is much less influential on DNA quality than initial fixation time. Crucially, given the fragmented nature of the material recovered (an expected result), future analyses of this material will have to be conducted using next-generation sequencing (NGS) technologies and approaches that rely on short fragments.

14
The mini yet mighty stapes: a comparison of ancient DNA yields among ossicles and the petrous bone

Saglıcan, E.; Sevkar, A.; Kazancı, D. D.; Yorulmaz, S.; Vural, K. B.; Cakan, G.; Adiloglu, G.; Duru, G.; Hassett, B.; Kayacan, N.; Özdogan, A. E.; Götherström, A.; Özer, F.; Erdal, O. D.; Erdal, Y. S.; Somel, M.

2025-07-18 genetics 10.1101/2025.07.17.664655 medRxiv
Top 0.1%
70.2%

The petrous bone is considered the most efficient source of endogenous DNA across skeletal tissues in ancient DNA research as well as in forensic work. Recently, ancient DNA (aDNA) in auditory ossicle bones was shown to be as well preserved as in the petrous, although no attempt was made to distinguish among the three ossicle bones. In this study, we used a total of 114 human ossicle- and petrous-derived sequencing libraries from similar contexts (c. 10,000 BP - 7,000 BP Anatolia), including 34 matched libraries prepared from the same individuals' ossicle and petrous bones. Our results suggest that endogenous human aDNA preservation in the stapes is on average two times higher than in the petrous bone; it also tends to be higher than in the malleus and incus. Similarly, aDNA fragments were longer in the stapes than in the petrous, whereas postmortem damage, clonality and contamination rates were comparable. Despite being the smallest bone in the human skeleton, the stapes may be the best aDNA source yet identified.

15
Comparative Evaluation of Targeted RNA Sequencing Protocols for Gene Expression Quantification With and Without Unique Molecular Indices (UMIs)

Gosch, A.; Courts, C.

2025-01-27 molecular biology 10.1101/2025.01.27.635010 medRxiv
Top 0.1%
68.2%

Interest in forensic RNA analysis has increased in recent years. RNA molecules present in forensic samples can be accurately quantified via quantitative PCR (qPCR); however, due to the limited number of markers that can be assayed simultaneously per reaction, qPCR is less suitable for applications requiring gene expression quantification of large marker sets. A few years ago, massively parallel targeted RNA sequencing (targRNAseq), which allows several hundred markers to be quantified simultaneously and accurately, was added to the forensic genetic tool set. However, typical targRNAseq protocols include a multiplex PCR step to amplify selected targets, which potentially introduces bias and limits accurate gene expression quantification. Unique Molecular Indices (UMIs) were invented to overcome this limitation and have been implemented in protocols from some vendors. In this study, we compared two targeted RNAseq protocols assaying expression of a set of 121 forensically relevant mRNA biomarkers: the Ion Ampliseq targeted RNA sequencing panel (Thermo Fisher Scientific), which employs a multiplex PCR without the use of UMIs, and the QIAseq targeted RNA panel (QIAGEN), which uses UMIs prior to multiplex amplification. Both protocols were tested on replicated samples and dilution series and compared with respect to sensitivity and accuracy of gene expression quantification. The UMI-based protocol exhibited decreased sensitivity in comparison to the non-UMI-based alternative; however, making use of UMI technology greatly improved gene expression quantification accuracy. We thus recommend the use of UMI-based protocols for targeted RNA sequencing in applications requiring accurate gene expression quantification.

16
Gene expression profiling for forensic age assessment of porcine skin wounds

Baekgaard, C.; Skovgaard, K.; Hansen, M. H.; Jensen, H. E.; Barington, K.

2025-11-10 genetics 10.1101/2025.11.06.686914 medRxiv
Top 0.1%
64.9%

Determining the age of wounds is of utmost importance in both veterinary and human forensic pathology. The aim of this study was to design and optimize quantitative polymerase chain reaction (qPCR) primers for use on degraded RNA samples obtained in veterinary forensic cases, thereby ensuring assay robustness. Moreover, the aim was to evaluate whether an expression signature, based on optimized short-amplicon primers, was able to differentiate porcine experimental granulation tissue according to age and whether this could be used for wound age assessment in veterinary forensic cases. Initially, 12 samples of experimental granulation tissue (n=6) and skin (n=6) from two pigs were deliberately exposed to RNA-degrading conditions before being stored in RNAlater. A panel of 24 robust primers was selected based on the intentionally degraded samples. Granulation tissue (5, 10, 15, 20, 25, 30, and 35 days of age) (n=94) and control skin (n=47) from 47 experimental pigs were sampled and stored in RNAlater. Furthermore, granulation tissue and fibrous scar tissue were sampled from 14 veterinary forensic cases. Microfluidic qPCR was performed to evaluate the expression of 24 genes. An expression signature of 14 genes reflected the age of the experimental wounds. The 5-day-old wounds displayed the greatest divergence from the control skin. As the granulation tissue matured, gene expression gradually approached the levels observed in intact skin. The forensic samples clustered somewhat separately from the experimental samples. In conclusion, granulation tissue from the experimental wounds displayed a time-dependent expression profile based on 14 short-amplicon primers suitable for use with low-quality RNA. However, an expression profile of 14 genes cannot be used as the sole method for forensic age assessments of porcine wounds.

17
Developmental Validation of the Illumina Infinium Assay using the Global Screening Array (GSA) on the iScan System for use in Forensic Laboratories

Russell, D. A.; Gorden, E. M.; Peck, M. A.; Neal, C. M.; Heaton, M. C.; Bouchet, J.; Koeppel, A. F.; Ciuzio, E.; Turner, S. D.; Reedy, C. R.

2022-10-13 genomics 10.1101/2022.10.10.511614 medRxiv
Top 0.1%
60.6%

Microarray processing, which interrogates hundreds of thousands of single nucleotide polymorphisms (SNPs) across the human genome, has recently gained traction in forensics due to its use in forensic genetic genealogy, which uses SNPs to compare distant relatives in publicly curated databases for the purposes of developing investigative leads or identifying human remains. To date, there has been no published developmental validation of microarray processing using the Scientific Working Group on DNA Analysis Methods (SWGDAM) Validation Guidelines for DNA Analysis Methods and Federal Bureau of Investigation Quality Assurance Standards. Validation of these methods is warranted to identify samples suitable for microarray analysis and to assess the quality of the data obtained prior to upload to genealogical databases. In this study, we validated the Global Screening Array (GSA) for use in forensic investigations according to SWGDAM guidelines, including the following studies: precision and accuracy, sensitivity, contamination, degradation, species specificity, mock case-type samples, mixtures, repeatability and reproducibility, and stability. Results indicated accurate genotype calls with SNP call rates >95% at DNA input as low as 0.20 ng. In addition to SNP call rate, we developed interpretation thresholds for signal intensity and heterozygosity to allow for sample quality assessment and identification of highly degraded and/or non-human DNA. This study demonstrates that high-quality data can be generated from multiple sample types, including mock forensic evidence that simulated the challenges that are often encountered in forensic cold cases.

18
Archaeogenomic and Bioinformatic Analysis of the Columbus Lineage: Evidence from the Counts of Gelves.

Navarro Vera, I.; Bonilla, A.; Tirapu, M.; Albert, M.; Jimenez, P. P.; Herranz-Rodrigo, D.; Cruz-Alcazar, R.; Garcia, C.; Yravedra Sainz de los Terreros, J.

2026-04-04 genetics 10.64898/2026.04.01.715912 medRxiv
Top 0.1%
55.4%

The geographical and familial origins of Christopher Columbus have remained a subject of intense historiographical debate for over five centuries. Despite numerous hypotheses, empirical genetic evidence capable of resolving his ancestral history or place of birth has been absent from the literature until now. This study presents the third stage of the first forensic genetic analysis performed on skeletal remains belonging to several direct descendants of Columbus, spanning the 16th to 18th centuries. By applying Massively Parallel Sequencing (MPS) to analyse autosomal, X- and Y-chromosome DNA markers, and integrating the results with multidisciplinary evidence from historical, genealogical, archaeological, and anthropological research involved in this project, the identification of several individuals found in the Crypt of Santa Maria de Gracia located in Gelves (Sevilla, Spain) has been achieved. The analysis of their biological relatedness enabled the reconstruction of kinship networks among the individuals interred in the crypt, which, when interpreted in the context of documented genealogical lineages, provides indirect but consistent evidence pointing toward the debated origin of the discoverer.

19
DNA methylation-based forensic age estimation in human bone

Gopalan, S.; Gaige, J.; Henn, B. M.

2019-10-11 genetics 10.1101/801647 medRxiv
Top 0.1%
54.7%

DNA methylation is an epigenetic modification of cytosine nucleotides that represents a promising suite of aging markers with broad potential applications. In particular, determining an individual's age from their skeletal remains is an enduring problem in the field of forensic anthropology, and one that epigenetic markers are particularly well-suited to address. However, all DNA methylation-based age prediction methods published so far focus on tissues other than bone. While high accuracy has been achieved for saliva, blood and sperm, which are easily accessible in living individuals, the highly tissue-specific nature of DNA methylation patterns means that age prediction models trained on these particular tissues may not be directly applicable to other tissues. Bone is a prime target for the development of DNA methylation-based forensic identification tools as skeletal remains are often recoverable for years post-mortem, and well after soft tissues have decomposed. In this study, we generate genome-wide DNA methylation data from 32 individual bone samples. We analyze this new dataset alongside published data from 133 additional bone donors, both living and deceased. We perform an epigenome-wide association study on this combined dataset to identify 108 sites of DNA methylation that show a significant relationship with age (FDR < 0.05). We also develop an age-prediction model using lasso regression that produces highly accurate estimates of age from bone spanning an age range of 49-112 years. Our study demonstrates that DNA methylation levels at specific CpG sites can serve as powerful markers of aging, and can yield more accurate predictions of chronological age in human adults than morphometric markers.

20
Likelihood Ratios Given Activity-Level Propositions for DNA Transfer Evidence: Practical Implementation and Simulation Studies Using the HaloGen Engine (Part II)

Gill, P.; Bleka, O.

2026-02-09 genetics 10.64898/2026.02.06.703509 medRxiv
Top 0.1%
54.2%

The interpretation of findings of low-template DNA given activity-level propositions requires robust statistical models capable of accommodating substantial inter-laboratory and case-specific variability. This paper presents the practical implementation of HaloGen, an open-source hierarchical Bayesian framework for calculating activity-level likelihood ratios (LRs) from DNA quantity data. We compare three modelling approaches derived from the framework: a Group model, which combines data across laboratories, a hierarchically informed Lab-Bayes model, and a standalone, laboratory-specific Lab-Vague model. Through a series of simulation studies, we demonstrate that evidential strength is highly sensitive not only to DNA quantity but also to case context, particularly the assumed number of offenders (NS). We further show that inter-laboratory differences in DNA recovery and dropout can lead to materially different LRs, making unvalidated use of pooled or external data potentially misleading. To address practical implementation, we propose a minimum-effort validation pathway for laboratories wanting to report findings given activity-level propositions. Our results indicate that a small number of direct/secondary transfer experiments (n ≈ 6-12) are sufficient to obtain conservative LRs compared with a generic population model. Finally, these results clarify how contextual assumptions enter mathematically into activity-level inference, demonstrating that confirmation bias can arise naturally from unexamined modelling choices and underscoring the importance of transparent, explicit specification of propositions and parameters.